Segmentation and Recognition of Printed Arabic Characters
Authors
Abstract
Arabic characters differ significantly from other characters such as Latin and Chinese characters in that they are written cursively in both printed and handwritten forms. The alphabet consists of 28 main characters, but most of their shapes change according to their position in the word. These shapes, together with some other secondaries, raise the number of classes to 120. Furthermore, some of these characters share the same shape and are distinguished only by the presence of one, two or three dots above or below them. In this paper, words are first segmented into characters, and secondaries are removed using novel algorithms. This reduces the number of classes from 120 to 32. Information about these secondaries, such as their number, position and type, is recorded and used in the final recognition stage. Features of the skeletonised character are used for classification with a decision tree. A recognition rate of 97.23% is achieved.
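The abstract does not describe the paper's own algorithms for removing secondaries, so the following is only a minimal sketch of the general idea under stated assumptions: the character is already segmented and binarised, the largest connected component is taken to be the main body, and every other component is treated as a secondary. The function names (connected_components, split_secondaries) and the toy glyph are hypothetical, and the "type" of each secondary is not modelled.

```python
from collections import deque


def connected_components(grid):
    """Return the 8-connected foreground (1) components as lists of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def split_secondaries(grid):
    """Assumption: the largest component is the main body, the rest are secondaries.

    Returns (main_pixels, info), where info records the number of secondaries
    and whether each one lies above or below the main body's mean row.
    """
    components = connected_components(grid)
    if not components:
        return [], {"count": 0, "positions": []}
    main = max(components, key=len)
    main_centre = sum(y for y, _ in main) / len(main)
    positions = []
    for comp in components:
        if comp is main:
            continue
        centre = sum(y for y, _ in comp) / len(comp)
        # Row indices grow downwards, so a smaller mean row means "above".
        positions.append("above" if centre < main_centre else "below")
    return main, {"count": len(positions), "positions": positions}


# Hypothetical toy glyph: a bowl-shaped body with two dots above it.
GLYPH = [
    [0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
]

main, info = split_secondaries(GLYPH)
print(info)
```

In a pipeline like the one the abstract outlines, the main body would then go on to skeletonisation and feature extraction, while the recorded count and position would be kept to tell apart characters that share the same body shape.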
Similar resources
Printed Arabic optical character segmentation
Considerable progress has been achieved in recognition techniques for many non-Arabic characters. In contrast, little effort has been devoted to research on Arabic characters. In any Optical Character Recognition (OCR) system, segmentation is usually the essential stage, to which an extensive portion of processing is devoted and a considerable share of recognition errors is attributed. ...
An Adaptive Algorithm for the Automatic Segmentation of Printed Arabic Text
Character segmentation is a crucial step in most Arabic optical text recognition systems. The recognition process depends mainly on the accuracy of the character segmentation. This paper presents a novel adaptive algorithm for the off-line segmentation of printed Arabic text. Arabic writing has many challenging features; for example, it is cursive and characters in a word can take ...
A Survey on Arabic Character Recognition
Off-line recognition of text plays a significant role in several applications, such as the automatic sorting of postal mail or the editing of old documents. It is the ability of the computer to distinguish characters and words. Automatic off-line recognition of text can be divided into the recognition of printed and handwritten characters. Off-line Arabic handwriting recognition still faces great challen...
Arabic Character Segmentation Using Projection Based Approach with Profile's Amplitude Filter
Arabic is one of the languages that [...] challenges to optical character recognition ([...] challenge in Arabic is that it is mostly cursive [...] segmentation process must be carried out [...] character’s start and end. This step is essential [...] recognition. This paper presents [...] segmentation algorithm. The proposed algorithm [...] projection-based approach concepts to separate [...] and characters. This is done using profile's [...] and simple edge to...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Automatic Arabic Hand Written Text Recognition System
Despite the considerable development of pattern recognition applications in the last decade of the twentieth century and in this century, text recognition remains one of the most important problems in pattern recognition. To the best of our knowledge, little work has been done in the area of Arabic text recognition compared with that for Latin, Chinese and Japanese text. The main difficult...